Current situation

• Complex games/benchmarks are available and used on Linux;
• Drivers are getting more complex as performance improves;
• Users now rely on Open Source drivers for games/apps...


Risks when merging new code

• Breaking previously working functionality or rendering;
• Inadvertently breaking the performance of a game;
• Improving the performance of one app but slowing down others.

⇒ Need to test and benchmark all the platforms and games of interest.

• Pinpoint the change, to help bug reporting and fixing;
• Guarantee reproducibility of the results;
• Warn the relevant developers of changes.


Challenges

• Unit tests, performance, metrics or rendering can be unstable;
• Multiple components interact with each other;
• Avoiding false positives and false negatives;
• Impossible to test every commit.

Current solutions

• Unit testing: Piglit, dEQP, gl-CTS, vk-CTS, more...;
• Performance: Phoronix Test Suite, Sixonix;
• Rendering: Phoronix Test Suite, Anholt’s trace-db;
• Job scheduling: Phoronix Test Suite, Jenkins, ...


Issue: Great for reporting, not for bisecting
• No feedback loop to address variance issues;
• Environment may have changed;
• Unit tests may flip/flop;
• Rendering may be unstable (yes, it does happen);
• Solution: an external runner that takes care of this for them!

EzBench: General architecture

General flow graph

[Figure: flow graph with four stages: Acquire data, Generate report, Enhance report, Schedule enhancements]

• Acquire data: Compile/deploy, run tests and collect data/env;
• Generate report: Read from the disk, create a python IR;
• Enhance report: Analyse the data, find changes, report events;
• Schedule enhancements: Request more data (bisect!).
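
The four stages of the flow graph form a feedback loop, which can be sketched as follows. This is only an illustration under stated assumptions, not EzBench's actual code: the function names, the `run_test` callback and the 2% relative-spread threshold are all hypothetical.

```python
import statistics

def acquire(run_test, commits, rounds):
    """Acquire data: run the (hypothetical) test `rounds` times per commit."""
    return {c: [run_test(c) for _ in range(rounds)] for c in commits}

def generate_report(data):
    """Generate report: in-memory summary (mean, spread, sample count)."""
    return {c: (statistics.fmean(v), statistics.pstdev(v), len(v))
            for c, v in data.items()}

def enhance(report, max_rel_spread=0.02):
    """Enhance report: list the commits whose results are too noisy to trust."""
    return [c for c, (mean, spread, _n) in report.items()
            if mean and spread / abs(mean) > max_rel_spread]

def run_loop(run_test, commits, rounds=3, max_passes=5):
    data = acquire(run_test, commits, rounds)
    for _ in range(max_passes):
        noisy = enhance(generate_report(data))
        if not noisy:
            break
        # Schedule enhancements: request more data for the noisy commits only.
        extra = acquire(run_test, noisy, rounds)
        for c in noisy:
            data[c].extend(extra[c])
    return generate_report(data)
```

The loop stops either when every commit is stable enough or when the pass budget runs out, so an unstable test cannot keep the machine busy forever.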

EzBench: Code and license

MIT-licensed code
Available at https://cgit.freedesktop.org/ezbench/

• runner: bash-based, handles:
    • compilation and deployment of the component;
    • setting up the environment (X, compositor);
    • running the test.
• env-dump.so: LD_PRELOADed C library:
    • dumps the environment and loaded libs;
    • hooks interesting calls (GLX, EGL, GL, X);
    • dumps metrics (RAPL, GPU temperature and power usage).
• Report generation, enhancing and scheduling: python daemon;
• Reporting: python script generating an HTML file.
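
Once the environment of each run has been dumped, two runs can be compared key by key. The sketch below assumes the dump has already been parsed into a flat dictionary; the real env-dump output format is not described here, and the keys shown in the usage note are hypothetical.

```python
def diff_env(old, new):
    """Return {key: (old_value, new_value)} for every key that differs.

    A key missing on one side is reported with None on that side.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes
```

For instance, `diff_env({"libGL": "12.0.1"}, {"libGL": "12.0.2"})` reports only the `libGL` entry, which is the kind of change that would otherwise silently invalidate a comparison between two runs.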

EzBench: Features

• Supports:
    • Unit tests: Piglit, dEQP, IGT (WIP);
    • Benchmarks: GPUTest, Unigine, GFX Bench (corporate), ...;
    • Rendering: Apitrace.
• Acquires environment information, for catching changes;
• Analyses variance on data and reproduces changes;
• Auto-bisects on data (bisecting on metrics is WIP).

• Mesa: No limitations;
• xf86-video-intel: No limitations;
• Linux: may require an external watchdog.
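
Auto-bisecting boils down to a binary search over the commit history. The sketch below is a minimal illustration, not EzBench's implementation: `is_changed` is a hypothetical callback that builds and tests a commit and says whether it already shows the change, and it is assumed to be reliable (in practice this is where variance handling matters).

```python
def bisect_change(commits, is_changed):
    """Return the first commit showing the change.

    `commits` is ordered from oldest to newest; commits[0] is assumed to be
    unaffected and commits[-1] is assumed to show the change.
    """
    lo, hi = 0, len(commits) - 1  # invariant: lo is unaffected, hi is changed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_changed(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

Because the range is halved at every step, a history of about 5000 commits needs at most 13 builds to pinpoint the offending commit.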

Examples of variance

[Figure: Examples of variance (run histograms)]

EzBench: Handling variance

Student's t-test
Check whether two data sets belong to the same normal distribution.

[Figure: t-test illustration, plotted against data value]

Source: http://serc.carleton.edu/introgeo/teachingwdata/Ttest.html
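
The test can be sketched with the standard library alone. Two assumptions to note: this uses Welch's variant of the t-test (it does not require equal variances), and the p-value uses a normal approximation to the t distribution, which is only reasonable with enough samples. Function names and the 5% threshold are illustrative, not EzBench's.

```python
import math
import statistics

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    se_a = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    se_b = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df

def looks_like_change(a, b, alpha=0.05):
    """True if the two samples are unlikely to share the same mean."""
    t, _df = welch_t(a, b)
    # Assumption: normal approximation instead of the exact t distribution.
    p = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))
    return p < alpha
```

Both samples need at least two values and some spread, otherwise the variance terms are undefined or zero.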

EzBench: Image comparison

• Contributed by Pekka Jylha-Ollila (Intel);
• Comparison is done using RMSE and requires 3 steps.

Step 1: Compare the output of 2 versions
Step 2: Acquire more data and generate averages
Step 3: Use the Student's t-test on the RMSEs
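
The building blocks of these steps can be sketched as follows. Images are assumed to be flat lists of pixel values of equal length, and comparing each frame against an averaged reference is an assumption about how the averages are used, not a description of the contributed code.

```python
import math

def rmse(img_a, img_b):
    """Root-mean-square error between two images of identical size."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(img_a, img_b)) / len(img_a))

def average_image(frames):
    """Pixel-wise average of several captures of the same frame."""
    return [sum(px) / len(frames) for px in zip(*frames)]

def rmse_samples(frames, reference):
    """One RMSE per capture, measured against a reference image."""
    return [rmse(frame, reference) for frame in frames]
```

The two lists returned by `rmse_samples` for the old and the new version are then ordinary numeric samples, so the same statistical test used for performance data can decide whether the rendering really changed.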

EzBench: Demo time

Demo 1: running loads with the simple runner
• Listing tests;
• Running gtkperf in different environments;
• Showing the generated report;
• Starting to compile a new version of Mesa.

Demo 2: Actual reports
• Auto-bisected rendering change (5k commits, 7 months);
• Running gtkperf in different environments;
• Showing the generated report;
• Starting to compile a new version of Mesa.

EzBench: Needed features for CI

Randomized testing
• Not all tests can be run every day;
• Tests should be added randomly (as time permits).
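
One possible way to add tests randomly within a time budget is sketched below. The expected durations, the budget and the function name are hypothetical; the only point is that the selection changes from run to run while never exceeding the available time.

```python
import random

def pick_tests(durations, budget_s, rng=None):
    """Randomly pick tests whose total expected duration fits in the budget.

    `durations` maps a test name to its expected run time in seconds.
    """
    rng = rng or random.Random()
    names = list(durations)
    rng.shuffle(names)  # a different order, hence a different subset, each day
    picked, used = [], 0.0
    for name in names:
        if used + durations[name] <= budget_s:
            picked.append(name)
            used += durations[name]
    return picked
```

Over many days, every test gets a chance to run even though no single day has room for all of them.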


Support changing multiple components at the same time
• EzBench needs to find the component that made the change;
• It thus needs to group data per environment;
• It needs to merge data from similar environments;
• It needs to be able to re-deploy environments;
• It needs to be able to recompile important components.
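
Grouping data per environment, and merging data from similar ones, can be pictured as hashing the environment description after dropping the keys that are considered irrelevant. This is a sketch under assumptions: the environment is taken to be a flat, JSON-serialisable dictionary, and the `ignore` parameter is a hypothetical way of defining "similar".

```python
import hashlib
import json
from collections import defaultdict

def env_fingerprint(env, ignore=()):
    """Stable hash of an environment, skipping the keys listed in `ignore`."""
    kept = {k: v for k, v in env.items() if k not in ignore}
    blob = json.dumps(kept, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def group_by_env(runs, ignore=()):
    """Group (env, value) pairs by environment fingerprint."""
    groups = defaultdict(list)
    for env, value in runs:
        groups[env_fingerprint(env, ignore)].append(value)
    return dict(groups)
```

Results that land in different groups were produced by different component versions, which narrows down which component introduced a change.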

EzBench's Goals
• Automatically annotate a git tree with:
    • Unit test results;
    • Power and performance results;
    • Rendering changes.
• Require as little human intervention as possible;
• Provide reproducible results (environment).

EzBench tries to take care of the pitfalls of benchmarking
• Environment dumping and diffing;
• Reproduces results and tries to handle variance;
• Is reactive to changes, and self-improving;
• Handles most of the testing automatically.


Questions? 

• Modular architecture (profiles, tests and user hooks);
• Automates the acquisition of benchmark data;
• Knows how long it is going to take;
• Generates a report that is usable by developers;
• Bisects performance changes automatically;
• Provides python bindings to acquire data and parse reports;
• Is crash-resistant: stores the expected goal and compares it to the current state;
• Collects the environment information and diffs it;
• Detects variance and performance changes;
• Automatically schedules more work to improve the report.
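
Crash resistance through a stored goal can be sketched as a pure function: the work still to do is the difference between what was requested and what is already on disk. The nested-dictionary layout and the commit and test names used in the usage note are hypothetical.

```python
def remaining_work(goal, done):
    """Return the rounds still missing, as {commit: {test: rounds}}.

    `goal` is the requested number of rounds per commit and test;
    `done` is what has already been collected.
    """
    todo = {}
    for commit, tests in goal.items():
        for test, wanted in tests.items():
            missing = wanted - done.get(commit, {}).get(test, 0)
            if missing > 0:
                todo.setdefault(commit, {})[test] = missing
    return todo
```

After a crash or a reboot, calling `remaining_work` again with the saved goal and the results found on disk yields exactly the runs that were lost, so nothing needs to be remembered in memory.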

EzBench - Features

• Watchdog support;
• Handle kernel boot failures (need the watchdog);
• Add support for PTS as a backend;
• Better integrate the build process;
• React to HW events such as throttling;
• Reset the environment to a previous state;
• Integrate with patchwork to test patch series;
• Support sending emails to the authors of changes.


